{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "edd4512a",
   "metadata": {},
   "source": [
    "\n",
    "# Introduction to Python for Biostatistics\n",
    "\n",
    "This notebook introduces Python programming concepts for biostatistics, including basic operations, data manipulation, and exploratory analysis.\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8fa0c4bc",
   "metadata": {},
   "source": [
    "\n",
    "## Dataset Information\n",
    "\n",
    "Any dataset with numerical and categorical variables is suitable for practicing biostatistical techniques. The examples in this notebook are written with the **Diabetes dataset** in mind, which is available on Kaggle:\n",
    "\n",
    "- [Diabetes Dataset - Kaggle](https://www.kaggle.com/datasets/mathchi/diabetes-data)\n",
    "\n",
    "### Dataset Attributes\n",
    "\n",
    "- **Pregnancies**: Number of pregnancies.\n",
    "- **Glucose**: Plasma glucose concentration at 2 hours in an oral glucose tolerance test.\n",
    "- **BloodPressure**: Diastolic blood pressure (mm Hg).\n",
    "- **SkinThickness**: Triceps skinfold thickness (mm).\n",
    "- **Insulin**: 2-hour serum insulin (μU/mL).\n",
    "- **BMI**: Body mass index (weight in kg / (height in m)^2).\n",
    "- **DiabetesPedigreeFunction**: Diabetes pedigree function, a score summarizing family history of diabetes.\n",
    "- **Age**: Age of the patient (years).\n",
    "- **Outcome**: Class variable (0 = non-diabetic, 1 = diabetic).\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c6f0ece",
   "metadata": {},
   "source": [
    "\n",
    "## Basic Python Operations\n",
    "\n",
    "This section covers a few Python basics: arithmetic such as addition, string formatting with f-strings, and comments.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "674cef1f",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# Addition\n",
    "print(\"Addition example:\", 8 + 9)\n",
    "\n",
    "# String formatting: an f-string substitutes the variable's value into the text\n",
    "name = \"Python\"\n",
    "print(f\"Welcome to {name} programming!\")\n",
    "\n",
    "# Comments\n",
    "# This is a comment and will not be executed\n",
    "print(\"Comments make code readable.\")\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "170180b1",
   "metadata": {},
   "source": [
    "\n",
    "## Variables and Data Types\n",
    "\n",
    "Explore variables and data types, including integers, floats, and strings.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1ca220b5",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# Define variables\n",
    "days_in_year = 365\n",
    "pi_value = 3.14159\n",
    "language = \"Python\"\n",
    "\n",
    "# Display variables and their types\n",
    "print(f\"Days in a year: {days_in_year}, Type: {type(days_in_year)}\")\n",
    "print(f\"Value of Pi: {pi_value}, Type: {type(pi_value)}\")\n",
    "print(f\"Programming language: {language}, Type: {type(language)}\")\n",
    "        "
   ]
  },
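  {
   "cell_type": "markdown",
   "id": "b9191885",
   "metadata": {},
   "source": [
    "\n",
    "## Loading Data\n",
    "\n",
    "Load data with pandas, a library for manipulating tabular data. The cell below builds a small three-row DataFrame as a stand-in so the notebook runs without the CSV file; to work with the real dataset, load it with `pd.read_csv` instead.\n",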
    "        "
   ]
  },
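  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7a24d9bb",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "import pandas as pd\n",
    "\n",
    "# Load the dataset (replace with your file path).\n",
    "# The r prefix makes a raw string, so Windows backslashes are not read as escapes like \\t.\n",
    "# Example: data = pd.read_csv(r'C:\\Path\\to\\diabetes.csv')\n",
    "# Until then, a small stand-in DataFrame keeps the notebook runnable.\n",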
    "data = pd.DataFrame({\n",
    "    \"Age\": [25, 30, 35],\n",
    "    \"BMI\": [22.5, 24.3, 28.7],\n",
    "    \"Outcome\": [0, 1, 0]\n",
    "})\n",
    "\n",
    "# Display the first few rows\n",
    "print(data.head())\n",
    "        "
   ]
  },
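  {
   "cell_type": "markdown",
   "id": "b3e1a7c4",
   "metadata": {},
   "source": [
    "A minimal sketch of how `pd.read_csv` parses CSV content, for readers who have not downloaded the file yet. The CSV text is illustrative: it reuses the three example rows from the DataFrame above and is not taken from the real dataset. `io.StringIO` wraps the text so that `read_csv` can treat it like a file, and `csv_data` is a name chosen for this example; with a real file you would pass its path instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d94f20ab",
   "metadata": {},
   "outputs": [],
   "source": [
    "import io\n",
    "import pandas as pd\n",
    "\n",
    "# Illustrative CSV text (assumed values, not rows from the real dataset)\n",
    "csv_text = \"\"\"Age,BMI,Outcome\n",
    "25,22.5,0\n",
    "30,24.3,1\n",
    "35,28.7,0\n",
    "\"\"\"\n",
    "\n",
    "# read_csv accepts a file-like object, so StringIO stands in for a file on disk\n",
    "csv_data = pd.read_csv(io.StringIO(csv_text))\n",
    "\n",
    "# The header row becomes the column names; column types are inferred from the values\n",
    "print(csv_data.head())\n",
    "print(csv_data.dtypes)"
   ]
  },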
  {
   "cell_type": "markdown",
   "id": "c8c6fcad",
   "metadata": {},
   "source": [
    "\n",
    "## Basic Data Exploration\n",
    "\n",
    "Inspect the dataset's structure with `info()` (column types and non-null counts) and its summary statistics with `describe()`.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c97d030c",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# Display data information.\n",
    "# info() prints its report itself and returns None, so it is not wrapped in print().\n",
    "data.info()\n",
    "\n",
    "# Display summary statistics\n",
    "print(data.describe())\n",
    "        "
   ]
  },
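  {
   "cell_type": "markdown",
   "id": "e5c8136f",
   "metadata": {},
   "source": [
    "Because `Outcome` is a binary class label, one possible next step is to count the rows in each class and compare summary statistics between the two groups. The cell below is a sketch that assumes the `Age`, `BMI`, and `Outcome` columns used in this notebook; it rebuilds the three example rows under the hypothetical name `example` so that it runs on its own."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f07a2b9d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Same three example rows as the loading cell (illustrative values only)\n",
    "example = pd.DataFrame({\n",
    "    \"Age\": [25, 30, 35],\n",
    "    \"BMI\": [22.5, 24.3, 28.7],\n",
    "    \"Outcome\": [0, 1, 0]\n",
    "})\n",
    "\n",
    "# Number of rows in each outcome class\n",
    "print(example[\"Outcome\"].value_counts())\n",
    "\n",
    "# Mean Age and BMI within each outcome class\n",
    "print(example.groupby(\"Outcome\")[[\"Age\", \"BMI\"]].mean())"
   ]
  },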
  {
   "cell_type": "markdown",
   "id": "913eccbd",
   "metadata": {},
   "source": [
    "\n",
    "## Visualizing Data\n",
    "\n",
    "Plot a histogram with a kernel density estimate (KDE) overlay to explore how a variable is distributed.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "28bdf41c",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "import seaborn as sns\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Histogram for Age\n",
    "sns.histplot(data['Age'], kde=True)\n",
    "plt.title(\"Age Distribution\")\n",
    "plt.xlabel(\"Age\")\n",
    "plt.ylabel(\"Frequency\")\n",
    "plt.show()\n",
    "        "
   ]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 5
}
